Introduction

With our Earth’s tectonic plates ever-shifting, earthquakes have occurred as a result, which can have immediate and long-term impacts on people’s life. Magnitude, as one of the determining factors of the earthquake level, may indicate how much energy is released during the earthquake. The magnitude of the earthquake might fluctuate depending on the level of the tectonic plate impact [1]. Just as the structure of the Earth is changing, earthquakes may also be changing at different rates or in different ways at various locations in the world.

Some studies show that large earthquakes greater than 8.0 in magnitude have increased slightly since 2004 [2]. Hence, we are wondering if the average magnitude of the significant earthquake also increases statistically significantly before and after 2004. In this report, we are looking at the potential relationship between the magnitude of the earthquake from the time frame 1965-2003 in comparison to 2004-2016. We are hypothesizing that the earthquakes’ average monthly magnitude from 1965 to 2003 is statistically significantly different from the average monthly magnitude from 2004 to 2016.

Background

We are examining earthquakes, ranging from 1965 to 2016 with a magnitude of 5.5 or higher, that were recorded by The National Earthquake Information Center (NEIC). This data was made available to us thanks to the seismographic network that recorded the vibration energy released by the ground at each seismic station.The database is available for free downloading on Kaggle [3].

Animation shows how the magnitude of significant earthquakes changed throughout the world from 1965 to 2016

Data

Summary of average monthly magnitude of Earthquake from 1965 to 2003 data set:
Summary of average monthly magnitude of Earthquake from 1965 to 2003 data set
year Magnitude count
Length:468 Min. :5.619 Min. :12.00
Class :character 1st Qu.:5.825 1st Qu.:27.00
Mode :character Median :5.875 Median :34.00
NA Mean :5.887 Mean :35.45
NA 3rd Qu.:5.935 3rd Qu.:42.00
NA Max. :6.315 Max. :83.00
Summary of average monthly magnitude of Earthquake from 2004 to 2016 data set:
Summary of average monthly magnitude of Earthquake from 2004 to 2016 data set
year Magnitude count
Length:156 Min. :5.700 Min. : 23.00
Class :character 1st Qu.:5.827 1st Qu.: 34.00
Mode :character Median :5.874 Median : 40.00
NA Mean :5.875 Mean : 43.71
NA 3rd Qu.:5.919 3rd Qu.: 48.00
NA Max. :6.102 Max. :227.00

The data set comes from the National Earthquake Information Center (NEIC), an organization that collects data about earthquakes worldwide. They share this data with international organizations so that they too can utilize it for their research and/or predictions. Countries, where the data is collected for the set, include – but are not limited to – The United States, United Kingdom, Ghana, and Japan. The data set doesn’t explicitly label the countries, but rather the data set acknowledges the longitude and latitude – which then, with a large enough sample size, will accurately show which countries are present. In our project, we eliminated longitude and latitude variables in order to refine the size of the data set.

Overall, the data set includes 23412 earthquakes from 1965 to 2016 (19591 from 1965 to 2003 and 6818 from 2004 to 2016). We will be grouping earthquakes by month and average magnitude to ensure the size of the data set is small enough for analysis: total: 625 earthquakes (468 earthquakes from 1965 to 2003, and 156 earthquakes from 2004 to 2016). The data is comprehensive, as there is a large enough duration (1965-2016) and a large enough sample size (worldwide data). It is also clear what determines which data is included or not. Since the data set only records the magnitude of significant earthquakes over the world, earthquakes with minor magnitudes (less than 5.5) are not included in the data set.

Variables

Key Variables from the Significant Earthquake Data set
Name Description
Year The year shows which year the earthquakes happened. It is shown in the dbl format.
Month The month shows which month the earthquakes happened. It is shown in the dbl format.
Count The count summarizes the number of Significant earthquake in the month
Magnitude The magnitude represents the average size of the Earthquake of that month

The rest of the report will examine if the average monthly significant earthquake magnitude from 1965 to 2003, has a statistically significant difference from the average monthly significant earthquake magnitude from 2004 to 2016.

Analysis

Graphic of comparison between average monthly magnitude from 1965-2003 and after 2004

Here is a graph that illustrates the change in average monthly significant earthquake magnitude over world from 1965 to 2003:

**Mean Monthly Magnitude of Earthquake over World from 1965 to 2003**. The red line connects the annual mean earthquake magnitude (black points). The blue curve shows a smooth trend through the data using local regression method. The green dashed line shows the mean of the average monthly magnitude over years

Mean Monthly Magnitude of Earthquake over World from 1965 to 2003. The red line connects the annual mean earthquake magnitude (black points). The blue curve shows a smooth trend through the data using local regression method. The green dashed line shows the mean of the average monthly magnitude over years

We see that the average monthly earthquake magnitude has gradually decreased from 1965 to 1970. After that, the magnitude decrease rate suddenly increases from 1970 to 1980. After 1980, the magnitude slightly increase until 1990 and then remained stable at around magnitude 5.87.

Here is a graph that illustrates the change in average monthly significant earthquake magnitude over world from 2004 to 2016:

**Mean Monthly Magnitude of Earthquake over World from 2004 to 2016**. The red line connects the annual mean earthquake magnitude (black points). The blue curve shows a smooth trend through the data using local regression method.The green dashed line shows the mean of the average monthly magnitude over years

Mean Monthly Magnitude of Earthquake over World from 2004 to 2016. The red line connects the annual mean earthquake magnitude (black points). The blue curve shows a smooth trend through the data using local regression method.The green dashed line shows the mean of the average monthly magnitude over years

We see that the average monthly earthquake magnitude has gradually increased from 2004 to 2008. After that, the magnitude remained stable at around magnitude 5.87 until 2016.

Since the mean of the data set from 1965 to 2003 and the mean of the data set from 2004 to 2021 is close (5.887 and 5.874). By comparing the trend from 1965 to 2003 and the trend from 2004 to 2016, it is hard to tell if there is statistically significant difference of average monthly magnitude between two data sets.Hence, two independent sample t test is required to test if there is statistically significant difference of average monthly magnitudes.

Independent Two Sample t Test

Estimates from Summary Data

Important values from two sample t test
year n mean sd
1965-2003 468 5.887294 0.1042034
2004-2016 156 5.875401 0.0757516



Confidence Interval for \(\mu 1\)-\(\mu 2\) & Two Sided Hypothesis Test

\(H0\) : \(\mu 1\) = \(\mu 2\)

\(H_a\) : \(\mu 1\) \(\neq\) \(\mu 2\)

Perform a two sample t-test to find the confidence interval:

Important values from two sample t test
t degree_of_freedom p_value lower_bound upper_bound
1.5356 364.12 0.1255 -0.0033372 0.027124



Equation for Two Sample t-test for Confidence Interval

\[ \mu_1-\mu_2\pm t\times\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}} \]

\[ \mu = sample mean\\ t = t value\\ n = sample size \\ s = sample standard deviation \]

Applied to our data, we get the following formula: \[ 5.887294-5.875401\pm1.966518\times\sqrt{\frac{0.1042034^2}{468}+\frac{0.07575158^2}{156}}\approx -0.003337342, 0.0271242 \]



Equation for Two Sample t-test for T Statistic

To calculate the T-statistic, we use the following formula: \[ \frac{\mu_1-\mu_2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}\\ \] Applied to our data, we get the following formula: \[ \frac{(5.887294-5.875401)}{\sqrt{\frac{0.1042034^2}{468}+\frac{0.07575158^2}{156}}}\approx1.53562 \]


p Value Visualization

The evidence is consistent with there being no difference in the mean monthly average earthquake magnitude between 1965 to 2003 and 2004 to 2016. (p=0.1255, two-sided t-test, df=364.12).

Discussion

From the two-sample t-test, we can interpret that we are 95% confident the mean monthly magnitude of earthquakes from 1965 to 2003 is between 0.0033 lower and 0.027 higher than the mean monthly magnitude of the earthquake from 2004 to 2016.The confidence interval spans over 0 which helps prove the timeframe 2004-2016 is less likely to have a statistically significant difference compared to the timeframe 1965-2003. In addition, we can see the p-value calculated is 0.1255 which is higher than the \(\alpha\) value: 0.05. We can not reject the null hypothesis: \(H0\) : \(\mu 1\) = \(\mu 2\). Hence, there is not enough evidence to show that the monthly mean magnitude of the significant earthquakes from 1965 to 2003 has a statistically significant difference compared to the monthly mean magnitude of significant earthquakes from 2004 to 2016.

We sorted the data and used the mean monthly earthquake magnitude instead of all the earthquake magnitude data to decrease the size of the data set from 23412 samples to 624 samples. Hence, some degree of freedom is lost during data transformation which might affect the outcome of the two-sample t-test. Based on our finding that there is no statistically significant difference in average monthly earthquake magnitude between 1965 to 2003 and 2004 to 2016. It is reasonable to raise a new question that could be investigated in the future: Is there any monthly trend among the average monthly earthquake magnitude? (Which month has the highest monthly earthquake magnitude, and which one has the lowest?)

Furthermore, in the future, to test if there is a year trend in the monthly average earthquake magnitude, we can perform a correlation analysis to test if there is a correlation between year and monthly average earthquake magnitude. Our data set covers all the significant earthquakes from 1965 to 2016. In the future, we can collect additional data on the magnitude of significant earthquakes from 2017 to 2022 in order to refine and make the conclusion more persuasive.

Looping back to our original points of interest, comparing the p-value calculated by the two-sample t-tests with a 95% confidence interval, we cannot reject the null hypothesis – denoted as \(H0\) : \(\mu 1\) = \(\mu 2\). Our experiment concluded that there wasn’t a remarkable difference in magnitude trends to officially proclaim that the magnitudes of earthquakes does not have statistically significant difference over the years.

Reference

[1] “Earthquake Magnitude, Energy Release, and Shaking Intensity.” Earthquake Magnitude, Energy Release, and Shaking Intensity | U.S. Geological Survey, https://www.usgs.gov/programs/earthquake-hazards/earthquake-magnitude-energy-release-and-shaking-intensity.

[2] Conners, Deanna. “Is the Number of Large Earthquakes Increasing?: Earth.” EarthSky, 11 Nov. 2017, https://earthsky.org/earth/are-large-earthquakes-increasing-in-frequency/.

[3] Data source: https://www.kaggle.com/datasets/usgs/earthquake-database (accessed on 27 April 2022)